Query Expansion for Noisy Legal Documents
Abstract
The vocabulary of the TREC Legal OCR collection is noisy and huge. Standard techniques for improving retrieval performance, such as content-based query expansion, are ineffective for such a document collection. In our work, we focused on exploiting metadata using blind relevance feedback, iterative improvement from the reference Boolean run, and the effects of using terms from different topic fields for automatic query formulation. This paper describes our methodologies and results.
Similar resources
A Generative Blog Post Retrieval Model that Uses Query Expansion based on External Collections
To bridge the vocabulary gap between the user’s information need and documents in a specific user generated content environment, the blogosphere, we apply a form of query expansion, i.e., adding and reweighing query terms. Since the blogosphere is noisy, query expansion on the collection itself is rarely effective but external, edited collections are more suitable. We propose a generative model...
Query expansion based on relevance feedback and latent semantic analysis
Web search engines are one of the most popular tools on the Internet which are widely-used by expert and novice users. Constructing an adequate query which represents the best specification of users’ information need to the search engine is an important concern of web users. Query expansion is a way to reduce this concern and increase user satisfaction. In this paper, a new method of query expa...
A Cross-language Information Retrieval Based on an Arabic Ontology in the Legal Domain
In this paper, we describe a web-based multilingual tool for Arabic information retrieval based on ontology in the legal domain. We illustrate the manual construction of the ontology and the way it is edited using Protégé2000. Using Arabic (UN) documents we identify the legal terms and the semantic relations between them before mapping them onto their position in the ontology. The process of se...
On the importance of Legal Catchphrases in Precedence Retrieval
This paper presents our working notes for FIRE 2017, Information Retrieval from Legal documents - Task 2 (Precedence retrieval). Common Law Systems around the world recognize the importance of precedence in Law. In making decisions, judges are obliged to consult prior cases that have already been decided to ensure that there is no divergence in the treatment of similar situations in different cases. ...
Interactive Query Refinement for Boolean Search
Boolean search is still the method of choice for many kinds of professional search, such as constructing systematic reviews in legal and medical fields. It is effective for fast, high-recall document classification. Its drawback is the difficulty in crafting a Boolean query that captures semantically relevant documents. Ambiguous search terms lead to the inclusion of non-relevant documents. We ...
Publication date: 2008